Country-report pattern corrections of new cases allow accurate 2-week predictions of COVID-19 evolution with the Gompertz model

Accurate short-term predictions of COVID-19 cases with empirical models allow health officials to prepare for hospital contingencies within a two-to-three-week window, given the delay between case reporting and the admission of patients to hospital. We investigate the ability of Gompertz-type empirical models to provide accurate predictions up to two and three weeks ahead, giving a large window of preparation in case of a surge in virus transmission. We investigate the stability of the predictions and their accuracy using bi-weekly predictions during the last trimester of 2020 and 2021. Using data from 2020, we show that understanding and correcting for the daily reporting structure of cases in the different countries is key to achieving accurate predictions. Furthermore, we found that filtering out predictions that are highly unstable to changes in the parameters of the model, which are roughly 20% of them, strongly reduces the number of predictions that are far off. The method is then tested for robustness with data from 2021. We found that, for these data, only 1–2% of the one-week predictions were off by more than 50%. This increased to 3% for two-week predictions, and reached 10% only for three-week predictions.


Introduction
In this supplemental material we show, first, how countries have been filtered according to whether their reporting data are reliable enough for the study (Supplementary Table S1). Then, the reporting pattern (Supplementary Figure S1) and the performance analysis of the four different prediction methods (Supplementary Figures S2 to S4) are shown for every country that passed the initial filter. Finally, Supplementary Table S2 lists which prediction days have been considered for every country and which predictions are considered unstable, so that the analysis using all the predictions can be compared with the analysis that rejects the unstable ones.

Country filtering: no reporting data
A list of countries belonging to the EU, EFTA and UK is shown in Supplementary Table S1 with their population and number of days with no report of new cases. Those countries with more than 2 days without reporting cases are coloured in red, meaning that they are discarded from our study due to the unreliability of the data: Cyprus, Denmark and Norway. The countries presenting one or two days without reporting data are coloured in yellow, but the data series of these countries are easily corrected by distributing the extra cases of the next day. Finally, the two countries coloured in orange (Estonia and Latvia) present such low numbers of cases that most days do not reach the new-cases threshold explained in the Methods section of the main manuscript, so they are not taken into account.
Supplementary Table S1. List of countries with their population, standard deviation and peak-to-peak difference of the data of new cases of COVID-19, and number of days without reporting new cases. The countries eliminated from the study due to unreliability of data are coloured in red and orange, while countries with a yellow background have days without case reports, but so few that their series are easily corrected.
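The filtering rule described above can be illustrated with a minimal Python sketch. It is not the code used in the study: it assumes that a day without a report appears as 0 in the daily series, and that the cases reported after a gap are split evenly over the gap and the reporting day, which is only one possible reading of "distributing the extra cases of the next day". The function names and the `max_missing` parameter are hypothetical, and the low-case threshold that excludes Estonia and Latvia is not reproduced here.

```python
def count_missing_days(new_cases):
    """Number of days with no reported cases (assumed to be encoded as 0)."""
    return sum(1 for n in new_cases if n == 0)


def classify_country(new_cases, max_missing=2):
    """Return 'keep', 'correct' (1 to max_missing empty days) or 'discard'."""
    missing = count_missing_days(new_cases)
    if missing == 0:
        return "keep"
    if missing <= max_missing:
        return "correct"
    return "discard"


def fill_missing_days(new_cases):
    """Spread the cases reported after a gap evenly over the gap (assumption)."""
    fixed = [float(n) for n in new_cases]
    i = 0
    while i < len(fixed):
        if fixed[i] == 0:
            # Find the first day after the gap that has a report.
            j = i
            while j < len(fixed) and fixed[j] == 0:
                j += 1
            if j < len(fixed):
                share = fixed[j] / (j - i + 1)
                for k in range(i, j + 1):
                    fixed[k] = share
            i = j + 1
        else:
            i += 1
    return fixed
```

For example, the series `[100, 0, 220, 90]` has one empty day, is classified as `"correct"`, and becomes `[100.0, 110.0, 110.0, 90.0]`, which preserves the total number of cases.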

Pattern analysis
For each day t, we compute the ratio between the new cases on that day and the 7-day moving average value, n_7(t). In Supplementary Figure S1 we show the reporting patterns, depending on the day of the week, of the 23 countries taken into account in this study.
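As an illustration of how such a weekday pattern could be computed, the following Python sketch averages the ratio between daily new cases and the 7-day moving average for each day of the week. The centred window (days t-3 to t+3) is an assumption, since the window alignment is not specified here, and the names `moving_average_7`, `weekday_pattern` and `first_weekday` are hypothetical.

```python
def moving_average_7(new_cases, t):
    """Centred 7-day mean around day t (the window alignment is an assumption)."""
    window = new_cases[t - 3 : t + 4]
    return sum(window) / 7.0


def weekday_pattern(new_cases, first_weekday=0):
    """Mean ratio new_cases[t] / n_7(t) per weekday (0 = Monday, 6 = Sunday).

    first_weekday is the weekday of the first entry of the series.
    """
    sums = [0.0] * 7
    counts = [0] * 7
    # Skip the first and last three days, where the centred window is incomplete.
    for t in range(3, len(new_cases) - 3):
        n7 = moving_average_7(new_cases, t)
        if n7 > 0:
            weekday = (first_weekday + t) % 7
            sums[weekday] += new_cases[t] / n7
            counts[weekday] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]
```

A value above 1 for a given weekday means that the country tends to report more cases than its weekly average on that day, and a value below 1 (typically at weekends) means under-reporting.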

Performance analysis of the prediction methods
Four different models of prediction have been used in this study to fit the Gompertz function, each with a different minimization function:
• Model B (Baseline): minimizes the error in accumulated cases.
• Model F (Fallback): minimizes the error in new cases.
• Model H (Hallmark): minimizes the error in new cases with corrected data due to their daily patterns.
• Model I (Introduction of Patterns): minimizes the error in accumulated cases with corrected data due to their daily patterns.
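The difference between the four models can be sketched in Python as two objective functions applied to either raw or pattern-corrected data. This is a hedged illustration, not the fitting code of the study: the Gompertz parameterization (final size `K`, initial value `N0`, rate `a`), the use of squared errors, and the correction that divides each day by its weekday factor are all assumptions, and every function name is hypothetical.

```python
import math


def gompertz(t, K, N0, a):
    """Cumulative cases N(t) = K * exp(-ln(K / N0) * exp(-a * t)) (assumed form)."""
    return K * math.exp(-math.log(K / N0) * math.exp(-a * t))


def error_accumulated(params, cumulative):
    """Squared error on accumulated cases (objective of Models B and I)."""
    K, N0, a = params
    return sum((gompertz(t, K, N0, a) - c) ** 2 for t, c in enumerate(cumulative))


def error_new_cases(params, cumulative):
    """Squared error on daily new cases (objective of Models F and H)."""
    K, N0, a = params
    total = 0.0
    for t in range(1, len(cumulative)):
        model_new = gompertz(t, K, N0, a) - gompertz(t - 1, K, N0, a)
        data_new = cumulative[t] - cumulative[t - 1]
        total += (model_new - data_new) ** 2
    return total


def correct_for_pattern(new_cases, pattern, first_weekday=0):
    """Divide each day's cases by its weekday factor (assumed correction)."""
    return [n / pattern[(first_weekday + t) % 7] for t, n in enumerate(new_cases)]
```

Under these assumptions, Models B and F would minimize `error_accumulated` and `error_new_cases` on the raw series, while Models I and H would minimize the same two functions on the series returned by `correct_for_pattern`, using any general-purpose optimizer.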
The accumulated relative error of the prediction fit while using the four different models can be observed in Supplementary Figure S2 as a function of the predicted day for each one of the countries which passed the initial filter.
In Supplementary Figure S3, we show the success rate of each model for each country as the maximum allowed relative error varies (from 0.1 up to 0.5) on different predicted days (7th, 14th and 21st). In Supplementary Figure S4, we show the success rate of each model for the mean of all the predictions of all countries.
Both 2020 and 2021 are represented for each country except Sweden.
Supplementary Figure S4. Success rates for the 7th, 14th and 21st predicted days as a function of the allowed accumulated relative error for all four prediction methods, for 2020 and 2021.
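The success rate shown in these figures can be illustrated with a short Python sketch. It assumes that the relative error of a prediction is the absolute difference between predicted and observed accumulated cases divided by the observed value, and that a prediction succeeds when this error does not exceed the allowed maximum. The exact definitions are given in the main manuscript, and the function names below are hypothetical.

```python
def relative_error(predicted, observed):
    """Relative error of a predicted accumulated value (assumed definition)."""
    return abs(predicted - observed) / observed


def success_rate(errors, max_error):
    """Fraction of predictions whose relative error is at most max_error."""
    if not errors:
        return float("nan")
    return sum(1 for e in errors if e <= max_error) / len(errors)


def success_curve(errors, thresholds=(0.1, 0.2, 0.3, 0.4, 0.5)):
    """Success rate for each allowed maximum relative error."""
    return {th: success_rate(errors, th) for th in thresholds}
```

Applied separately to the errors of the 7th, 14th and 21st predicted days, `success_curve` would give one curve per horizon and per model, which is the kind of quantity plotted against the allowed error.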
Supplementary Figure S1. Reporting patterns, by day of the week (Monday to Sunday), of the 23 EU+EFTA+UK countries that remain after applying the first filter in Supplementary Table S1.
Supplementary Figure S2. Accumulated relative error as a function of the xth day of the prediction for all of the 23 countries in the study for 2020 and 2021.